9 research outputs found

Explicit diversification of event aspects for temporal summarization

Author: Macdonald Craig
McCreadie Richard
Ounis Iadh
Santos Rodrygo L.T.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/04/2018

During major events, such as emergencies and disasters, a large volume of information is reported on newswire and social media platforms. Temporal summarization (TS) approaches are used to automatically produce concise overviews of such events by extracting text snippets from related articles over time. Current TS approaches rely on a combination of event relevance and textual novelty for snippet selection. However, for events that span multiple days, textual novelty is often a poor criterion for selecting snippets, since many snippets are textually unique but are semantically redundant or non-informative. In this article, we propose a framework for the diversification of snippets using explicit event aspects, building on recent works in search result diversification. In particular, we first propose two techniques to identify explicit aspects that a user might want to see covered in a summary for different types of event. We then extend a state-of-the-art explicit diversification framework to maximize the coverage of these aspects when selecting summary snippets for unseen events. Through experimentation over the TREC TS 2013, 2014, and 2015 datasets, we show that explicit diversification for temporal summarization significantly outperforms classical novelty-based diversification, as the use of explicit event aspects reduces the amount of redundant and off-topic snippets returned, while also increasing summary timeliness

Enlighten

Diversity and novelty in information retrieval

Author: Altıngövde İsmail Sengör
Can Fazli
Castells Pablo
Santos Rodrygo L.T.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 02/09/2013

This tutorial aims to provide a unifying account of current research on diversity and novelty in different IR domains, namely, in the context of search engines, recommender systems, and data streams

OpenMETU (Middle East Technical University)

Explicit web search result diversification

Author: Rodrygo L.T. Santos
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date

Crossref

Learning to rank query suggestions for adhoc and diversity search

Author: Macdonald Craig
Ounis Iadh
Santos Rodrygo L.T.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/08/2013

Query suggestions have become pervasive in modern web search, as a mechanism to guide users towards a better representation of their information need. In this article, we propose a ranking approach for producing effective query suggestions. In particular, we devise a structured representation of candidate suggestions mined from a query log that leverages evidence from other queries with a common session or a common click. This enriched representation not only helps overcome data sparsity for long-tail queries, but also leads to multiple ranking criteria, which we integrate as features for learning to rank query suggestions. To validate our approach, we build upon existing efforts for web search evaluation and propose a novel framework for the quantitative assessment of query suggestion effectiveness. Thorough experiments using publicly available data from the TREC Web track show that our approach provides effective suggestions for adhoc and diversity search

Enlighten

About learning models with multiple query dependent features

Author: He Ben
Macdonald Craig
Ounis Iadh
Santos Rodrygo L.T.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/07/2013

Several questions remain unanswered by the existing literature concerning the deployment of query dependent features within learning to rank. In this work, we investigate three research questions to empirically ascertain best practices for learning to rank deployments: (i) Previous work in data fusion that pre-dates learning to rank showed that while different retrieval systems could be effectively combined, the combination of multiple models within the same system was not as effective. In contrast, the existing learning to rank datasets (e.g. LETOR) often deploy multiple weighting models as query dependent features within a single system, raising the question as to whether such combination is needed. (ii) Next, we investigate whether the training of weighting model parameters, traditionally required for effective retrieval, is necessary within a learning to rank context. (iii) Finally, we note that existing learning to rank datasets use weighting model features calculated on different fields (e.g. title, content or anchor text), even though such weighting models have been criticised in the literature. Experiments to address these three questions are conducted on Web search datasets, using various weighting models as query dependent, and typical query independent features, which are combined using three learning to rank techniques. In particular, we show and explain why multiple weighting models should be deployed as features. Moreover, we unexpectedly find that training the weighting model's parameters degrades learned models' effectiveness. Finally, we show that computing a weighting model separately for each field is less effective than more theoretically-sound field-based weighting models

CiteSeerX

Enlighten

Modelling efficient novelty-based search result diversification in metric spaces

Author: Gil-Costa Veronica
Macdonald Craig
Ounis Iadh
Santos Rodrygo L.T.
Publication venue: 'Elsevier B.V.'
Publication date: 31/01/2013

Novelty-based diversification provides a way to tackle ambiguous queries by re-ranking a set of retrieved documents. Current approaches are typically greedy, requiring O(n²) document–document comparisons in order to diversify a ranking of n documents. In this article, we introduce a new approach for novelty-based search result diversification to reduce the overhead incurred by document–document comparisons. To this end, we model novelty promotion as a similarity search in a metric space, exploiting the properties of this space to efficiently identify novel documents. We investigate three different approaches: pivoting-based, clustering-based, and permutation-based. In the first two, a novel document is one that lies outside the range of a pivot or outside a cluster. In the latter, a novel document is one that has a different signature (i.e., the document's relative distance to a distinguished set of fixed objects called permutants) compared to previously selected documents. Thorough experiments using two TREC test collections for diversity evaluation, as well as a large sample of the query stream of a commercial search engine show that our approaches perform at least as effectively as well-known novelty-based diversification approaches in the literature, while dramatically improving their efficiency

Elsevier - Publisher Connector

Information retrieval on the blogosphere

Author: Macdonald Craig
McCreadie Richard
Ounis Iadh
Santos Rodrygo L.T.
Soboroff Ian
Publication venue: 'Now Publishers'
Publication date: 01/01/2012

Blogs have recently emerged as a new open, rapidly evolving and reactive publishing medium on the Web. Rather than managed by a central entity, the content on the blogosphere (the collection of all blogs on the Web) is produced by millions of independent bloggers, who can write about virtually anything. This open publishing paradigm has led to a growing mass of user-generated content on the Web, which can vary tremendously both in format and quality when looked at in isolation, but which can also reveal interesting patterns when observed in aggregation. One field particularly interested in studying how information is produced, consumed, and searched in the blogosphere is information retrieval. In this survey, we review the published literature on searching the blogosphere. In particular, we describe the phenomenon of blogging and the motivations for searching for information on blogs. We cover both the search tasks underlying blog searchers' information needs and the most successful approaches to these tasks. These include blog post and full blog search tasks, as well as blog-aided search tasks, such as trend and market analysis. Finally, we also describe the publicly available resources that support research on searching the blogosphere

Enlighten

Modelling efficient novelty-based search result diversification in metric spaces

Author: Craig Macdonald
Iadh Ounis
Rodrygo L.T. Santos
Veronica Gil-Costa
Publication venue: 'Elsevier BV'
Publication date

Crossref

Blog track research at TREC

Author: Craig Macdonald
Iadh Ounis
Ian Soboroff
Rodrygo L.T. Santos
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date

Crossref